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stability and translational efficiency of most mRNAs 
in mouse fibroblasts 
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Variation in protein output across tlie genome is controlled at several levels, but the relative contributions of different 
regulatory mechanisms remain poorly understood. Here, we obtained global measurements of decay and translation 
rates for mRNAs with alternative 3' untranslated regions (3' UTRs) in murine 3T3 cells. Distal tandem isoforms had 
slightly but significantly lower mRNA stability and greater translational efficiency than proximal isoforms on average. 
The diversity of alternative 3' UTRs also enabled inference and evaluation of both positively and negatively acting cis- 
regulatory elements. The 3' UTR elements with the greatest implied influence were microRNA complementary sites, 
which were associated with repression of 32% and 4% at the stability and translational levels, respectively. Nonetheless, 
both the decay and translation rates were highly correlated for proximal and distal 3' UTR isoforms from the same 
genes, implying that in 3T3 cells, alternative 3' UTR sequences play a surprisingly small regulatory role compared to 
other mRNA regions. 



[Supplemental material is available for this article.] 

The sequence features of an mRNA constitute a platform for trans 
factors to bind and exert regulatory control over its localization, 
stability, and translational efficiency. The specific combination 
of ds-regulatory sequences present In an mRNA is determined 
through processing of the pre-mRNA Into a mature mRNA com- 
posed of three DNA-encoded regions: the 5' untranslated region 
(5' UTR), the open reading frame (ORF), and the 3' UTR. 

At least three attributes of 3' UTRs favor their utility for 
hosting binding sites for trans-regulatory factors. First, 3' UTRs 
have fewer sequence and length constraints and, therefore, pro- 
vide a more fertile context for the evolutionary emergence and 
malntencmce of regulatory sites. Second, sites within 3' UTRs are 
out of the path of the scanning/translating ribosome, which can 
displace regulatory factors before they exert their effects (Grimson 
et al. 2007). Third, because of mRNA circularization, sites within 
3' UTRs are near key points of regulation of both mRNA degrada- 
tion and translation initiation. Hence, 3' UTR-bound trans factors 
can ciffect the amount of mRNA translated via several mechanisms, 
including (1) mRNA deadenylation, decapping, and degradation, 
(2) endonucleolytic cleavage and degradation (e.g., by SMG6 in 
nonsense-mediated mRNA decay [Huntzinger et al. 2008; Eberle 
et al. 2009]), (3) partial deadenylation that reduces the amount of 
PABP bound to the poly(A) tall, which might decrease translation- 
promoting circularization (Weill et al. 2012), (4) disruption of cap- 
bound initiation factors Qackson et al. 2010), and (5) changes in 
subcellular localization that sequester mRNAs away from ribo- 
somes (Sonenberg and Hinnebusch 2009). For example, the effects 
of the AU-rich element (ARE)-bindrng protein ELAVLl (formerly 
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known as HuR) on mRNA stability are mediated through binding 
to 3' UTRs (Lebedeva et al. 2011; Mukherjee et al. 2011). Most 
known binding sites of the PUF family of regulatory proteins are 
also in the 3' UTR (Quenault et al. 2011), and cytoplasmic mRNA 
localization Is most frequently mediated through ds-regulatory 
elements found in the 3' UTR (Andreassi and Riccio 2009). 

With these attributes, 3' UTRs can be the major drivers of 
tissue-specific expression. This power of 3' UTRs to regulate gene 
expression is illustrated In the Caenorhabditis elegans germllne. In 
which reporter mRNAs controlled by 3' UTR sequence, but not 
promoter sequence, recapitulate the cell-type-speclflc expression 
patterns of 24 different genes (Merrltt et al. 2008). In murine fi- 
broblasts, transcriptional regulation contributes substantially to 
differences in gene expression, but translational regulation is an- 
other major determinant, while mRNA and protein degradation 
rates are less important (Schwanhausser et al. 2011). The degree to 
which this translational regulation is mediated through 3' UTR 
elements Is unknown. 

Alternative cleavage and polyadenylatlon of transcripts from 
the same gene enables cell-type- or condition-specific expres- 
sion of 3' UTR isoforms, thereby adding another layer of post- 
transcriptional regulation. Analysis of expressed sequence tags 
(ESTs) suggests that a majority of mammalian genes contain mul- 
tiple cleavage and polyadenylatlon sites (Tlan et al. 2005), either In 
"tandem UTRs," for which choice of an earlier site by the cleavage 
and polyadenylatlon machinery precludes use of later sites, or in 
alternative last exons, for which the proximal alternative site falls 
within an intron and thus can be spliced out of the primary tran- 
script, allowing cleavage at the distal alternative site. Moreover, 
3' UTR Isoform choice Is Itself regulated. A coordinated shift 
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toward shorter tandem UTR isoforms is observed upon cell pro- 
liferation, and this pattern appears to generalize across tissues: the 
more highly proliferative a tissue, the shorter the average genome- 
wide 3 ' UTR length (Sandberg et al. 2008), with the longest average 
3' UTR lengths occurring in the brain (Ramskold et al. 2009). The 
correlation between 3' UTR length and cell proliferation is also 
observed across mouse embryonic development Qi et al. 2009). 
Additionally, oncogenic transformation can sometimes favor 
3' UTR shortening (Mayr and Bartel 2009), although subsequent 
genome-wide analyses have suggested that UTR shortening is as- 
sociated with transformation in only some cell lines (Shepard et al. 
201 1) . A progressive lengthening of 3 ' UTRs is also observed during 
zebrafish development, with particularly strong preference for 
longer isoforms in the brain (Li et al. 2012; Ulitsky et al. 2012). 

The functional relevance of tandem UTR shortening re- 
mains unclear. An early genome-wide study suggested that genes 
with particularly long 3 ' UTRs (>1 kb) had relatively short mRNA 
half-lives (Yang et al. 2003), but a subsequent higher-resolution 
study did not observe a significant relationship between 3' UTR 
length and stability (Sharova et al. 2009). However, these genomic 
surveys relied on published genome annotations of 3' UTRs and 
compared stabilities for mRNAs of dif- 
ferent genes rather than stabilities of 
different mRNA isoforms from the same 
genes, leaving open the possibility that 
the observed effects (or lack thereof) 
were caused by attributes merely cor- 
related with 3' UTR length. Reporter 
genes have supported the destabilizing 
effect of longer UTRs. For a handful of 
genes, luciferase reporters fused to longer 
tandem 3' UTR isoforms had decreased 
protein production compared with those 
fused to the shorter 3' UTR isoforms 
(Sandberg et al. 2008; Mayr and Bartel 
2009). Some of this down-regulation is 
due to mRNA destabiUzation of the longer 
isoform, as assayed by half-life measure- 
ments performed on three genes across 
several cell lines (Mayr and Bartel 2009). 
Whether shortened 3' UTRs generally 
lead to increased mRNA stability and 
translational efficiency is still an open 
question. 

Multiple methods have been de- 
veloped to annotate or quantify cleav- 
age and polyadenylation events glob- 
ally using high-throughput sequencing 
(Ozsolak et al. 2010; Fu et al. 2011; Jan 
et al. 2011; Shepard et al. 2011; Hoque 
et al. 2012; Martin et al. 2012), most of 
which involve oligo(dT) priming off the 
poly(A) tail. Oligo(dT) priming does well 
at identifying the most distal poly(A) sites 
but for proximal sites yields some false 
positives resulting from internal priming 
off of A-rich regions coded within the 
mRNAs. The 3P-seq (poly [A] -position pro- 
filing by sequencing) method does not 
use oligo(dT) priming and thus lacks these 
internal-priming artifacts and the conse- 
quent false-positive annotation of proxi- 



mal sites Qan et al. 2011). At their 3' end, 3P-seq tags typically 
include residues corresponding to a 3- to 7-nt fragment of the 
poly (A) tail (Fig. lA). These untemplated adenosines provide an 
unambiguous marker of a true poly(A) site, thereby enabling 
confident annotation. 

To explore the role of alternative cleavage and poly- 
adenylation on mRNA stability and translation, we first used 3P- 
seq to produce high-quality annotations of poly(A) sites in mouse 
fibroblasts. We then used a simpler poly(A)-tail-primed sequencing 
method called 2P-seq to quantify the mRNA half-life and trans- 
lational efficiency of each expressed, annotated 3' UTR isoform. 
mRNAs with shorter tandem UTR isoforms tended to be somewhat 
more stable than those with longer isoforms, and we identified 
known and novel regulatory motifs associated with stability dif- 
ferences. However, the differences in stability between short and 
long isoforms were small, with 3' UTR sequences playing a sur- 
prisingly small role in modulating mRNA half -life when compared 
to attributes of the mRNA coding region. Likewise, differences 
between isoform-level translation rates were also generally small, 
with mRNA-specific translational regulation occurring primarily 
through features of the 5' UTR and coding sequence. 
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Figure 1. Experimental methods used to assay alternative cleavage and polyadenylation. (A) 3P-seq 
(jan et al. 201 1 ). This method captures 3' end mRNA fragments without using oligo(dT) priming and, 
therefore, is highly specific for true poly(A) sites. A bridge oligo (black) is used to ligate a biotinylated 
oligo (blue); reverse transcription primer (red); captured mRNA 3' end fragment (gray); lllumina se- 
quencing adapters (purple and green). (B) Poly(A)-primed sequencing, or 2P-seq. This simplified 
method is useful for quantifying usage of known poly(A) sites. (C) Reproducibility of 2P-seq measure- 
ments of isoform abundance across two biological replicates. Only isoforms with at least a single read in 
either replicate were considered, and a single pseudocount was added prior to taking the log of each 
value. (D) Correspondence between mRNA quantified using 2P-seq and RNA-seq data sets from 3T3 
cells. A minimum of two tags or 0.1 RPKM was required for 2P-seq and RNA-seq, respectively, and 
pseudocounts of 1 and 0.1 , respectively, were added prior to taking the log of each value. 
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Results 

3P-seq annotation of murine poly(A3 sites 

To generate a reference set of poly(A) sites in mouse cells, we pre- 
pared 3P-seq libraries using RNA isolated from NIH 3T3 cells and 
embryonic stem cells (Fig. lA). Of the 49 million sequencing reads, 
20 million (41%) mapped uniquely to the mouse genome. These 
reads had an average of approximately four A's at their 3' ends, as 
expected from incomplete RNase H digestion of the poly(A) tail 
Qan et al. 2011). To the extent that these 3'-terminal A's did not 
match the genome, they provided a key indicator that the read 
originated from a true cleavage and polyadenylation site and was 
not a degradation product from within the body of the mRNA. The 
10 million reads that contained at least two 3'-terminal A's were 
used to annotate a total of 33,590 poly(A) sites genome wide 
(Supplemental Fig. SI; Methods). The fraction of informative reads 
(51%) was comparable to that achieved using a modified 3P-seq 
protocol, despite the claim that the modification improves this 
fraction (Hoque et al. 2012). These included at least one site for 
12,759 of 17,945 (71%) RefSeq genes (Pruitt et al. 2009) (using a 
strict definition of nonoverlapping genes). These 3P-seq-annotated 
poly(A) sites typically mapped near 3' ends of RefSeq-annotated 
transcripts, and of those that were distant, more were upstream 
of RefSeq-annotated sites than downstream. The mean 3' UTR 
length was 1209 nt, with a median of 775 nt. We used these 
3P-seq-annotated poly(A) sites for all subsequent analyses. 

2P-seq quantifies alternative cleavage and polyadenylation 

Although 3P-seq has clear advantages for annotating alternative 
poly(A) sites (Jan et al. 2011) and can also be used to quantify the 
usage of these alternative sites (Ulitsky et al. 2012), the method 
is cumbersome compared to other methods. Therefore, we de- 
veloped a more streamlined method, called poly(A)-primed se- 
quencing, or 2P-seq, to more conveniently quantify the alternative 
usage of known 3' UTR isoforms (Fig. IB). For 2P-seq, poly(A)- 
selected mRNA is first fragmented by partial digestion with RNase 
Tl, which does not cleave within the poly(A) tail. mRNA fragments 
are next reverse transcribed using a primer ending in a stretch of 
20 T's that were followed by a VN anchor (N represents a fully 
degenerate base, and V is any base except for T), and the resulting 
cDNAs were circularized, PCR-amplified, and sequenced. The 
lllumina sequencing used a custom primer that ended with 20 T's, 
and thus sequencing began at the next non-A nucleotide of the 
template. Hence, for cDNAs primed at the poly(A) tail, the first 
sequenced base corresponded to the 3' most non-A nucleotide of 
the mRNA. 

Because 2P-seq uses oligo(dT) priming, it is susceptible to 
mispriming at genomically encoded A-rich regions of mRNAs. In- 
deed, genomic nucleotides immediately downstream from mapped 
2P tags were significantly enriched for adenosine. However, this 
enrichment was not detected downstream from 2P tags mapping 
near to 3P-seq-annotated poly(A) sites (Supplemental Fig. S2). 
Therefore, all subsequent analyses were restricted to 2P tags map- 
ping within 20 nt of 3P-seq-annotated poly(A) sites, a range that 
allowed for the considerable microheterogeneity observed at 
cleavage and poly(A) sites (Hart et al. 1985; Jan et al. 2011). 

In mouse 3T3 cells, 2P-seq results for poly(A)-site usage cor- 
related very highly when comparing two biological replicates 
(Pearson rp = 0.94, Spearman = 0.96) (Fig. IC; Supplemental 
Fig. S3). 2P-seq results were also highly correlated with 3P-seq 
results (rs = 0.83) (Supplemental Fig. S3). Combining 2P tags from 



all poly(A) sites of each gene generated an mRNA expression profile 
that correlated well with the results of mRNA-seq (rp = 0.74, rs = 
0.78) (Fig. ID; Supplemental Fig. S3), which further validated the 
highly quantitative nature of 2P-seq. Finally, isoform quantifica- 
tion using a Northern blot agreed closely with that of 2P-seq, while 
demonstrating the increased resolution of 2P-seq for distinguish- 
ing isoforms (Supplemental Fig. S4). 

ORF exon-junction density is most important for predicting 
mRNA stability 

To explore the genome-wide effects of alternative cleavage and 
polyadenylation on mRNA stability, we treated 3T3 cells with ac- 
tinomycin D to block tremscription and prepared 2P-seq libraries 
over an 8-h time course, which provided a measure of the amount 
of each 3' UTR isoform remaining at each time point, as illustrated 
for mRNAs from the Cstfl gene (Fig. 2A,B). Half-lives for each 
3' UTR isoform were generated by fitting these data. 

Two tests supported the ability of our 2P-seq experiment to 
measure half-lives accurately. First, the data set was split into two 
independent replicates, each with three time points (each derived 
from a different plate of cells), and isoform-specific half-lives cal- 
culated from these two biological replicates were compared and 
found to be in good agreement (r^ = 0.76, n = 3192 isoforms) (Sup- 
plemental Fig. S5). Second, half-lives were calculated for each gene 
after combining tags mapping to all of its poly(A) sites, and these 
gene-level half-lives correlated well with stability values previously 
measured in 3T3 cells using metabolic labeling followed by se- 
quencing (rp = 0.62) (Supplemental Table SI; Schwanhausser et al. 
2011). These values also correlated well with half -lives measured 
using actinomycin D and microarrays in mouse ES cells (rp = 0.68) 
(Fig. 2C; Sharova et al. 2009) and mouse neuroblastoma cells (rp = 
0.70) (Clark et al. 2012). Indeed, these were among the best pair- 
wise correlations observed between re-analyzed half-life data sets 
from different cell types (Supplemental Table SI), although biases 
of the actinomycin D and metabolic labeling approaches pre- 
sumably influence these comparisons (Dolken et al. 2008). 

In agreement with previous studies (Yang et al. 2003; Sharova 
et al. 2009), we found that the top gene ontology categories asso- 
ciated with high mRNA stability (top quartile of half -lives) were 
related to cellular metabolism and translation (Supplemental Table 
S2). The top gene ontology categories associated with low mRNA 
stability (bottom quartile) were related to transcription factors, 
cellular signaling pathways, and regulation of various cellular 
processes. These enrichments were significant even after control- 
ling for the different steady-state mRNA levels observed between 
high- and low-stability gene sets. 

To provide context for our novel intragenic analysis, in which 
we measured the influence of the different 3' UTR isoforms of the 
same gene, we began with a more conventional inter-genic anal- 
ysis, considering sequence determinants common to all tandem 
3' UTR isoforms, including in the 5' UTR, ORF, and proximal 
3' UTR. A multivariate model of mRNA stability, which considered 
many potential sequence features, was trained on a subset of the 
data and then tested on the remaining data (Fig. 2D). Our model 
explained over one-third of the variance in half-lives between 
mRNAs of different genes (R^ = 0.39 using MARS, 95% confidence 
interval [0.389,0.409]; linear R^ = 0.35, 95% confidence interval 
[0.329,0.369]), a substantial improvement over the previously 
reported R^ value of 0.20 (Sharova et al. 2009). Our model still 
outperformed the previous model when trained and tested on 
previously published data sets (R^ range, 0.23-0.28) and when 
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Figure 2. mRNA and isoform half-life values, and modeling of associated mRNA features. (A) Schematic of approach used to measure half-lives of 
tandem 3' UTR isoforms. Distal and proximal isoforms are illustrated for cleavage stimulation factor subunit 1 (Cstfl). For each time point after tran- 
scriptional arrest v\/ith actinomycin D, 2P tags are plotted for each isoform as well as a gene-level aggregate, showing the corresponding half-life estimates 
below. Tags for the 0 h (steady-state) replicates were averaged. (6) Fitted exponential degradation curves, shown for the same isoforms as in A. An 
additional normalization factor not considered here was used to correct for the slight decrease in total mRNA at later time points, resulting in slightly lower 
final half-lives of 0.52 h (proximal) and 1 .20 h (distal). (C) Correlation between gene-based half-lives determined in 3T3 cells (2P-seq) with previous results 
in mouse embryonic stem cells (Sharova et al. 2009). (D) Correspondence between our measured half-lives and mRNA half-lives predicted using a model 
trained on a separate 50% of our data. For each gene with multiple tandem 3' UTRs, only the most proximal isoform was used to quantitate gene half-life, 
although results were similar when averaging over all isoforms or using only the most distal isoform. (£) The correspondence between mRNA half-life and 
ORF exon-junction density. A maximum of 0.02 exons/bp was used for ORF exon-junction density, affecting seven of the 7028 genes, (f) Sequence 
features used to predict half-lives, arranged in approximate order of importance in the model. Adding more features resulted in diminishing returns, with 
a model including only the top three to six features performing nearly as well as any of the models with more features. Because of significant multi- 
collinearity, features correlating well were grouped together (e.g., ORF length and ORF exon-junction density) with their pairwise correlation indicated 
(Pearson r). Also reported are each individual R'^ with mRNA stability, and each cumulative R'^ calculated for a linear model including all previous features. 
Predicted targeting for the top 10 mlRNAs or for individual mlRNAs was quantified using context scores (Crimson et al. 2007; Garcia etal. 201 1). Error bars 
show 95% confidence intervals. 
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trained on our 3T3 data and tested on 
other data sets (R^ range, 0.21-0.23), 
which confirmed its generality. For com- 
parison, the correlations between pre- 
viously published half-life data sets 
ranged from = 0.16 to = 0.49 (Sup- 
plemental Table SI), suggesting that our 
model encompasses a large proportion of 
the universal, cell-type-independent de- 
terminants of message stability. 

A single feature dominated our 
model of mRNA stability: The number of 
exon junctions per kilobase of coding 
sequence was positively associated with 
and accounted for 28% of the variation in 
mRNA half -lives (Fig. 2E). Exon-junction 
density has been previously observed 
to be correlated with stability, but the 
magnitude of this relationship (R^ = 0.11 
[Sharova et al. 2009] and R^ = 0.15 [Clark 

et al. 2012]) was substantially lower than that observed using our 
data. Perhaps exon-junction density plays a larger role in deter- 
mining half-life in 3T3 cells compared to either mES or neuroblas- 
toma cells. Alternatively, the increased correlation might indicate 
reduced experimental noise in measuring half-lives or more strin- 
gent filtering for high-quality stability measurements. 

Another important predictive feature was 3' UTR length. As 
reported by an early study (Yang et al. 2003) but not a subsequent 
higher-resolution study (Sharova et al. 2009), 3' UTR length was 
negatively associated with stability. It explained 9% of the varia- 
tion in mRNA half-lives when tested individually, and increased 
the R^ from 0.28 to 0.32 when included in the full model (Fig. 2F). 
Some other sequence features correlated with mRNA stability in- 
dividually but contributed only slightly to the predictive accuracy 
of the full model in the context of the other features (Fig. 2F). Se- 
quence features such as target sites for a specific highly expressed 
microRNA (miRNA) occur more sparsely throughout the expressed 
mRNAs, which helps explain their limited predictive capabilities 
when considering all mRNAs. 

Shorter 3' UTR isoforms are slightly more stable 

For three human genes examined over a number of cell lines, 
proximal 3' UTR isoforms are substantially more stable than distal 
isoforms, typically by 1.5- to 6.6-fold (Mayr and Bartel 2009). To 
explore the extent to which these findings generalize to other 
genes in other species, we compared the half-lives of all tandem 
3' UTR isoforms detected in mouse 3T3 cells. A significantly larger 
number of genes had more stable proximal isoforms (w = 318) 
compared with the number of genes with more stable distal iso- 
forms (« = 215, P = 9.4 X 10"*^ binomial test) (Fig. 3A). Despite this 
detectable shift toward more stable proximal isoforms, the half- 
lives of the proximal and distal tandem UTR isoforms of individual 
genes correlated very well overall (rs = 0.74, rp = 0.72), and the 
correlation increased (to tj = 0.80, rp = 0.79) when restricting the 
analysis to the more highly expressed genes (top 50%), for which 
half-life estimates were the most robust. The high correlation be- 
tween half -lives of tandem 3' UTR isoforms persisted when split- 
ting actinomycin D time points (each derived from a separate plate 
of cells) into independent biological replicates (r^ = 0.66, rj = 0.67), 
as did the significant shift toward more stable proximal isoforms 
(P = 0.01, P = 6 X 10"*, binomial test) (Supplemental Fig. S5). 
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Figure 3. Isoform-specific quantification of mRNA stability. (A) Relationship between half-lives of the 
adjacent proximal and distal tandem 3' UTR isoforms' half-lives, highlighting the statistically significant 
differences (P < 0.05, colored points) and their tallies (colored numbers). (6) Cumulative distribution 
function of data shown in A. Plotted is the fraction of genes with half-life exceeding the value of the x- 
axis. The very slight shift toward more stable proximal tandem UTR isoforms is statistically significant (P = 
4.0 X 1 0"^, Mann-Whitney test, n = 3463 tandem isoform pairs). 

These results indicating that tandem 3' UTR choice plays 
a relatively subtle role in regulating mRNA abundance under the 
conditions studied supported the previous finding that the pri- 
mary determinant of mRNA stability is exon-junction density in 
the coding sequence (Fig. 2E; Sharova et al. 2009; Clark et al. 2012), 
which should be identical across all tandem 3' UTR isoforms of 
each gene. Overall, the median proximal half-life was just 7.5% 
greater than the median distal half-life (Fig. 3B). This observation 
and our detection of more than 200 genes with more stable distal 
than proximal 3' UTR isoforms suggest that stabilizing regulatory 
elements in 3' UTRs (which could offset the effects of destabilizing 
elements) play a more prominent role than currently appreciated. 

As an initial assessment of the biological roles of genes whose 
stability is regulated by alternative cleavage and polyadenylation, 
we performed gene ontology analysis, comparing genes with 
and without significant differences in stability between tandem 
UTR isoforms. Genes with destabilized distal UTR isoforms were 
enriched for cellular localization-related categories, including 
membrane-, adhesion-, and extracellular-associated GO terms 
(Supplemental Table S3). The mRNAs encoding proteins with these 
specific subcellular localizations are themselves often localized, 
with a distinctive distribution in subcellular fractions consistent 
with rough ER or plasma membrane location (Wang et al. 2012), 
raising the possibility that alternative isoforms might often con- 
tribute to mRNA localization. 



Motifs associated with increased or decreased stability 
of alternative isoforms 

Because tandem 3 ' UTR isoforms typically share 5 ' UTR and coding 
sequences, differences in stability should depend largely on the 
sequence unique to the distal 3' UTR isoform. With this in mind, 
we searched the distal regions for motifs associated with increased 
or decreased stability of the distal isoform. To reduce the sequence 
space for this de novo motif search, all 8-nt sequences (allowing up 
to one degenerate position) were first ranked by their excess con- 
servation in 3' UTRs compared to control motifs. For the 901 most- 
conserved motifs (top 0.5%), all distal 3' UTRs containing each 
k-mer were identified. Because requiring the presence of a specific 
motif selects for longer 3' UTRs, non-motif-containing distal 
3' UTRs were sampled to match distal UTR lengths of the motif- 
containing UTRs. Using these length-matched controls, we tested 



2082 Genome Research 



www.genome.org 



Stability and translation of 3' UTR isoforms 



whether the median differences in half-life between the proximal 
and distal isoforms were significantly greater or less for the motif- 
containing compared to the non-motif -containing sets. 

The three motifs most significantly associated with destabi- 
lized distal isoforms perfectly matched (1) the relatively common 
consensus binding motif for the PUF family factors (Fig. 4A; White 
et al. 2001; Hafner et al. 2010); (2) a canonical UAUUUAUU AU- 
rich element (Barreau et al. 2005); and (3) a C-rich motif (Table 1; 



Neff et al. 2012). The fifth and tenth motifs on this list contained 
UG repeats, matching the preferred binding motif of CELF pro- 
teins, which are known to bind and destabilize mRNAs containing 
such motifs (Vlasova-St. Louis and Bohjanen 2011). Also in the top 
20 were matches to let-7 (canonical 8-nt site) (Fig. 4B) and miR-29 
(canonical 8-nt site with one N), the two most highly expressed 
miRNAs in 3T3 cells (Rissland et al. 2011), although these mlRNA 
regulatory motifs were identified after the analysis had reached 



— With UGUAnAUA (n = 579) 

— Without UGUAnAUA 
(sampied, n = 3983) 




B 



Fold-change in half-life 
(log^ proximal - log^ distal h) 



E ° 
o 



■ Wi1hCnUAAUAA{n = 211) 

■ WilhoutCnUAAUAA 
(sampled, n = 4239) 




Fold-change in half-life 
(log^ proximal - log^ distal h) 



0) 

3 



LL o 



r = -0.063 (Pearson) 
n = 3471 tandem UTRs 




Tviean delta 
'half-life - 0.t)7 
p'«0:01 (t-^est vs 0) 



I Mean della 

t expression = -0.15 

I p «0.01 (t-lest vs 0) 



With CUACCUCA (n - 52) 

Without CUACCUCA 

(sampled, n ^ 2950) 






p ^ 4 X 1 0^ 




1 1 1 
-4-2 0 


2 4 


Fold-change in half-life 


(log^ proximal 


- log^ distal h) 



2z O) 




r - 0.247 (Pearson) 
n - 9836 genes 



Expression 
(iog^ RPM, avg of 2 replicates) 



Fold-change in expression 
(log^ proximal - log^ distal) 
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Table 1. 


Top 20 motifs associated with stabilization or destabilization of tandem 3' UTRs 






Rank 


Motif 


P-value 


Q-value 


Relative repression 


Motif annotation 


1 


UGUAnAUA 


14 

*6 X 10 


4X10 


17% 


PUF 


2 


1 1 A A 1 1 A A 

CnUAAUAA 


*3 X 1 0"^ 


8 X 10"^ 


-22% 




3 


1 lAI II II lAI II 1 

UAUUUAUU 


*2 X 1 0"^ 


0.01 


15% 


AU-rich element 


4 


CCCCnCCC 


*8 X 1 0"^ 


0.03 


13% 


C-rich element 


5 


AUAAAGAn 


*2 X 1 0"^ 


0.01 


-1 5% 




6 


A 1 1 » 1 1 A A 1 i 

AUnUAAGU 


2 X 1 0"'' 


0.1 0 


-1 5% 




7 


/-^ 1 1 /" 1 1 « A 1 1 

CUGUGnAU 


8 X 1 0"^ 


0.06 


11% 




8 


UGUGnGUG 


1 X 1 0"'' 


0.08 


9% 


CELF 


9 


/ — A /™| 1 « A 1 1 

CACUGnAU 


4 X lO"'' 


0.20 


-1 9% 




10 


1 l»l II II II 1/^ A 

UnUUUUCA 


6 X 10~^ 


0.21 


-1 2% 




11 


CUACCUCA 


4 X lO""* 


0.20 


32% 


let-7 


12 


CCCACCnC 


2 X 10"' 


0.26 


8% 




13 


CCCAAUnA 


7 X 10"^ 


0.21 


-29% 




14 


ACCAGCCA 


1 X 10"^ 


0.40 


19% 




15 


AUnUAUUU 


4 X 10-" 


0.30 


7% 




16 


AUUUAUUn 


1 X 10"^ 


0.44 


-18% 




17 


UGUGUnUG 


7X10"" 


0.48 


12% 


CELF 


18 


UGGUGCnA 


3X10-^ 


0.47 


15% 


miR-29 


19 


AnUGCCUC 


6X10"" 


0.48 


13% 


MBNL 


20 


UAAAGAnA 


2X10"^ 


0.53 


-1 5% 





Stabilization is indicated by a negative relative-repression value. Motifs significant at the P< 0.05 level after Bonferroni correction are indicated with an 
asterisk (*), and the false discovery rate (Q-value) threshold for each motif is shown. Occurrences of previous top motifs were masked from 3' UTRs for 
subsequent motif discovery, which resulted in P-values that did not necessarily monotonically increase. 



false discovery rates (Q) of 0.20 and 0.47, respectively (Table 1). 
Perhaps most intriguing were the motifs associated with stabili- 
zation, including the second-most-significantly associated motif, 
CNUAAUAA (Table 1), as stabilizing elements within 3' UTRs have 
been far less characterized. A recent report identified several motifs 
associated with stabilization in mammalian cells as measured by 
metabolic labeling (Goodarzl et al. 2012). Neither of the top two 
motifs identified in their analysis was correlated with mRNA sta- 
bilization in our data set whether analyzed at the level of genes (P = 
0.47 and 0.28 for sRSMl and sRSM2, respectively) or isoforms (P = 
0.21 and 0.75). 

Although our unbiased motif search identified some known 
regulatory sequences and some Interesting candidate elements, 
the magnitude of repression or derepression associated with each 
of these was modest, with the let-7 motif associated with the 

strongest regulation (32% repression) (Table 1). Encouraged by our 
success in modeling mRNA stability differences between genes 
(Fig. 2F), we attempted to model the differences between tandem 
UTR isoforms of the same gene, Incorporating 3' UTR sequence 
determincmts (including PUF sites, mlRNA sites, and AU-rich ele- 
ments), 3' UTR length (as well as changes in % length), and se- 
quence conservation. The resulting model performed poorly (R^ < 
0.01) (data not shown), which reinforced the notion that these 
elements play subtle roles in regulating transcriptome-wide mRNA 
stability. 

mRNA steady-state levels are governed by the rates of both 
transcription and degradation, with transcription rates reported to 

be the more influential determinant in 3T3 cells (Schwanhausser 
et al. 2011). Likewise, when comparing different genes, we found 
a significant but small positive correlation between mRNA half-life 
and steady-state levels, with an R^ = 0.06 (Fig. 4D), very similar to 
the previous estimate (Schwanhausser et eil. 2011). With respect to 
the tandem 3' UTR isoforms for the same genes, if differential 
stability played a major role in determining their relative levels, we 
would have expected a strong positive correlation between the 
differences in mRNA levels and half -lives for isoform pairs. In fact, 
we found the opposite — a slight but statistically significant nega- 



tive correlation (rp = -0.06, P = 2.3 X 10 ") (Fig. 4E). This correla- 
tion remained negative, albeit no longer statistically significant, 
after removing all of the most distal 3 ' UTR isoforms, which tend to 
be the most highly expressed. Taken together, these results show 
that differential cleavage and polyadenylation are responsible for 
most of the differences in the steady state levels of tandem 3' UTR 
isoforms, and that differential mRNA stability is only a minor 
contributor. 

Proximal and distal tandem 3' UTR isoforms are often 
translated similarly 

After finding surprisingly little systematic difference with respect 
to mRNA stability (Fig. 3), we combined global polysome proflUng 
(Arava et al. 2003) with 2P-seq to assess the contribution of Isoform 

choice to translational efficiency. After arresting translating ribo- 
somes with cycloheximide, cells were harvested, and cytoplasmic 
lysate was fractionated on a sucrose gradient (Fig. 5A). RNA from 
each of the six gradient fractions was then analyzed by 2P-seq. 
From these data the average number of ribosomes per mRNA iso- 
form was calculated (Fig. 5B), normalizing to tag counts for yeast 
RNA that we had spiked into each fraction as it came off the gra- 
dient. As expected, isoforms of mRNAs with longer ORFs were 
generally bound by more ribosomes {r^ = 0.44). Further evidence 
that our polysome fractionation accurately separated mRNAs 
based on relative translation came from the comparison of our 
translational efficiency values with those obtained by other stud- 
ies. To calculate the translational efficiency for each gene, we 
combined the data for its different isoforms, and normalized for 
ORF length. These values correlated well with previous estimates 
derived from high-throughput proteomics (rs = 0.47) (Schwanhausser 
et al. 2011) and correlated even better with those derived from ri- 
bosome footprint profiling (i^ = 0.56) (Supplemental Fig. S6). 

Recent genome-wide studies of translation using either pro- 
teomics or ribosome footprint profiling provide relative trans- 
lation rates averaged over all copies of an mRNA present in 
the cell. As with global polysome profiling, our complementary 
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approach separated individual mRNAs based on the number of 
bound ribosomes. This enabled detection of cases in which a gene's 
mRNAs spanned distinct subpopulations, some highly translated, 
others less efficiently translated or sequestered away from ribo- 



somes. For example, Dnmtl mRNAs had a bimodal distribution 
across the gradient, with peaks at 2-3 and approximately 12 ribo- 
somes per message, suggestive of just such a partitioning of its 
cellular mRNAs (Fig. 5C). In principle, our approach would also 
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detect differences In elongation rates, with slower elongation 
resulting in more ribosomes bound per message and faster elon- 
gation resulting in fewer. However, with the exception of certain 
stress conditions (Liu et al. 2013; Shalgi et al. 2013), translation is 
not generally thought to be regulated at the elongation step. 
Therefore, we interpreted the isoform-specific differences in bound 
ribosomes as differences in translational efficiency, with fewer 
bound ribosomes presumably reflecting less translational initia- 
tion. This interpretation is analogous to those made previously in 
studies that use global ribosome profiling or ribosome footprint 
profiling to compare translational efficiency at the level of genes 
rather than isoforms (Arava et al. 2003; Hendrickson et al. 2009; 
Ingolia et al. 2009). Very few mRNA isoforms peaked in the pre- 
monosomal fraction, perhaps because ribosomes can continue to as- 
sociate at start codons even after cycloheximide blocks translocation, 
but in any event, this phenomenon would increase the detected 
number of bound ribosomes by no more than one per mRNA. 

Measuring the number of ribosomes at the mRNA level rather 
than at the gene level, when combined with 2P-seq, enabled 
comparison of the relative translational efficiencies for isoforms of 
the same gene. The plots for different tandem 3' UTR isoforms of 
the same gene typically had the same shape and indicated strik- 
ingly similar translational efficiencies (Fig. 5C; Supplemental Fig. 
S7). When considering all genes for which we had translation data 
for tandem 3' UTR isoforms, translational efficiencies of proximal 
and distal isoforms correlated highly (R^ = 0.58, n = 4298 tandem 
UTRs) (Fig. 5D). Moreover, this correlation improved markedly 
when restricting analysis to the most highly expressed genes (R^ = 
0.81, « = 851) (Fig. 5D), suggesting that measurement imprecision 
or biological noise was responsible for much of the increased var- 
iation observed in the more Inclusive analysis. We conclude that 
most translational regulation occurs through regions of the mRNA 
that are in common between the tandem isoforms, with relatively 
little variation in translational efficiency (<20%) remaining to be 
explained by a combination of measurement imprecision, noise, 
and elements unique to the distal isoforms. 

In contrast to our results related to the stability of tandem 
3' UTR isoforms, we found a small but significant genome-wide 



bias toward higher translational efficiencies for longer tandem 
3' UTRs using the fullest set of tandem UTR Isoforms (P < 10 n = 
4298, binomial test) (Fig. 5D) or just the most highly expressed 
isoforms (P= 1 x 10"^, n = 851) (Fig. 5D). Median translation rates 
for proximal and distal isoforms differed by <3%. The increased 
translational efficiency of the distal isoform might be due to re- 
cruitment of trans factors to ds-regulatory motifs in the distal 
Isoform, although a de novo motif search uncovered few candidate 
positive regulatory motifs (see below). Alternatively, increased 
3' UTR length might itself lead to an increase in translational 
efficiency. 

Motifs associated witti translational repression 

We next searched for motifs associated with translational regula- 
tion, applying the same approach used to identify potentially 
stabilizing and destabilizing elements. Unlike the motifs associ- 
ated with differential mRNA stability, the top motifs identified 
(12/12) were all associated with repression (Table 2). These in- 
cluded motifs that matched the let-7 (Q= 10"^) pg. 5E) and miR-29 
sites, which were the first and the fifth most enriched motifs 
from this search, respectively. An AC-rich motif matching hnRNP 
L binding sites was the third most significant repressive motif, and 
hnRNP L has been reported to play a role in regulating translation 
of some mRNAs (Ray et al. 2009). Two 8-mers containing one or 
two occurrences of UGCU, the canonical binding motif of MBNL 
proteins (Goers et al. 2008), occurred lower down on the list, albeit 
with Q-values of 0.4 and 0.7. MBNL proteins regulate pre-mRNA 
splicing, but are also expressed cytoplasmically and have been 
implicated in control of mRNA localization and translation (Wang 
et al. 2012). The PUP motif was not found to be associated with 
differential translation (P = 0.52). 

Although several motifs were highly significant in our de 
novo search, the magnitude of the change potentially mediated by 
the presence of each motif was even smaller than that for the 
motifs potentially regulating stability. For example, the implied 
relative translational repression averaged only 4% for the let-7 and 
miR-29 sites, even when focusing on only those UTRs with perfect 



Table 2. 


Top 20 motifs associated with increased or decreased transiationai efficiency 






Rank 


IVIotif 


P-vaiue 


Q-vaiue 


Reiative repression 


Motif annotation 


1 


CUACCUCn 


*2 X IQ-^ 


1.1 X 10"" 


4.2% 


let-7 


2 


ACCAAACn 


*8 X 10"<* 


6X10"^ 


3.2% 




3 


CCCAnCAC 


1 X 1 Q-'' 


0.054 


2.4% 


hnRNP L 


4 


ACnAGCUG 


2X10-'' 


0.091 


2.4% 




5 


UGGUCCUA 


5 X 1 0"'' 


0.153 


3.8% 


miR-29 


6 


AGnAGCCA 


6X10"'' 


0.305 


2.0% 




7 


UnACACCC 


7X10"" 


0.200 


2.0% 




8 


GCnAAUAA 


2X10"" 


0.150 


2.7% 




9 


UnUCAGGG 


8X10"" 


0.467 


2.8% 




10 


AGCCAAAn 


7 X 10"" 


0.238 


3.2% 




11 


GGUCCUnU 


2 X 10"^ 


0.406 


2.3% 


MBNL 


12 


GCCUCnUU 


2 X 10"^ 


0.422 


2.0% 




13 


UUGUAAAn 


8 X 10"" 


0.413 


-1 .4% 




14 


UCUGGCCU 


4 X 10"^ 


0.575 


-2.1% 




15 


AUUUAUUU 


3 X 10-^ 


0.693 


1 .4% 




16 


CCCAnUAA 


2 X 10"^ 


0.519 


2.5% 




17 


UGCUGCUn 


3x10"^ 


0.700 


1 .2% 


MBNL 


18 


CUUnUCAG 


6x10"^ 


0.662 


-1 .7% 




19 


ACUGUGAC 


4X10"^ 


0.705 


-2.2% 




20 


UCAnCUCA 


2 X 10"^ 


0.802 


2.2% 





Translation activation is indicated by a negative relative-repression value. Otherwise, as in Table 1 . 
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8-nt sites. Most prior measurements of global mlRNA repression 
required comparison of mlRNA-transfected or knockout cells to 
control cells. When evaluated at steady state, which allows the 
miRNAs to exert their full effects, those previous comparisons 
show that miRNAs predominantly act to destabilize mRNA targets, 
with smaller, albeit detectable, effects on translation (Baek et al. 
2008; Guo et al. 2010). Our measurements of 4% translational re- 
pression and 30% mRNA repression, obtained without experi- 
mental perturbation of miRNAs, agree with these previous results. 
Altogether, our results suggest that 3' UTRs play a minor role in 
regulating translational efficiency in 3T3 cells. 

Discussion 

Our large-scale analyses found that the choice of 3' UTR isoform 
caused surprisingly modest changes in mRNA half-lives and 
translation rates in 3T3 cells. This result tempers the inferences 
that our labs had made previously when considering potential 
consequences of the preferential use of shorter isoforms in rapidly 
dividing or transformed cells (Sandberg et al. 2008; Mayr and Bartel 
2009). Our previous studies observed relatively large differences 
in stability (Mayr and Bartel 2009) and translation (Sandberg et al. 
2008; Mayr and Bartel 2009) when using luciferase assays or 
Northern blots to examine a handful of tandem 3' UTR isoforms. 
However, the alternative isoforms previously chosen for experi- 
mental follow-up were selected based on their biological function 
and/or the differential presence of multiple sites to coexpressed 
miRNAs. For this reason and because the majority of distal UTR 
regions do not contain target sites for expressed miRNAs, these 
previously published results were not representative of all alter- 
native isoforms. 

When inspecting our current results, proximal isoforms were 
more highly translated for the two genes studied in Mayr and 
Bartel (2009) for which mouse homologs were evident. This ob- 
servation is in qualitative agreement with this previous human 
study, although only one of these two genes showed increased 
stability of the proximal isoform. With respect to genes studied in 
Sandberg et al. (2008), we found poor agreement in overall ex- 
pression differences between proximal and distal isoforms for four 
out of the five genes with similar isoforms expressed between that 
study and the current one. Differences between what we observed 
now and previously may be attributed to large differences in the 
cell tjrpes studied — 3T3 fibroblasts in this study compared with 
mouse primary lymphoc)^es in Sandberg et al. (2008) — and to 
potential regulatory divergence of orthologs in human (Mayr and 
Bartel 2009) compared to mouse (this study). In summary, the 
previous experiments illustrate the potentially large consequences 
that 3' UTR isoform choice can exert for selected genes in certain 
contexts. Our results concur, highlighting several hundred genes, 
or —15% of tandem UTR-containing genes, with significantly 
different stabilities (Fig. 3). However, we find that these large 
consequences are not the norm in 3T3 cells. 

Although our major finding from the current large-scale study 
might be considered "negative" in the sense that, when comparing 
isoforms from the same gene, we usually did not observe the large 
differences in stability and translation that might have been ex- 
pected, we emphasize that this negative finding is not subject 
to the caveats normally associated with a negative experimental 
result. This is because our negative finding was the outcome of 
positive experimental results, i.e., the strikingly strong and sig- 
nificant correlations that we observed between the short and long 
isoforms, in terms of both their stability (Fig. 3A) and their trans- 



lation (Fig. 5D). To the extent that our methods were not perfect, 
they underreported the correlation between the short and long 
isoforms and thus, if anything, overestimated (not underesti- 
mated) the impact of alternative 3' UTR isoforms on translation 
efficiency /mRNA stability. This being said, the strong correlations 
observed demonstrate that 2P-seq was in fact quite accurate for 
measuring isoform abundance. Because the ends for different 
mRNA isoforms have different sequences and reside on different 
molecules, 2P-seq provided essentially independent measure- 
ments for each isoform. Thus, for each gene with tandem UTR 
isoforms, it independently measured the distal isoform and in- 
dependently measured each of the proximal isoforms. The striking 
correlations that we observe between these independent mea- 
surements for the long and short isoforms, in terms of both their 
half-lives (Fig. 3A) and their translation efficiencies (Fig. 5D), 
proved that 2P-seq was accurate, since any error in these mea- 
surements would tend to reduce these correlations. 

It is formally possible that unknown systematic biases in the 
2P-seq method somehow increase the similarity in expression 
measured for isoforms from the same gene. However, given the 
high correspondence in gene quantification compared to RNA-seq, 
the high correspondence in isoform quantification between 2P-seq 
and 3P-seq, the good agreement with isoform quantification by 
Northern blot, and the high correspondence in measured half -lives 
compared to previous data sets, the 2P-seq method appears to 
be both highly precise as well as accurate in quantif5ring 3' UTR 
isoforms. 

Less than 35% of all 3' UTR sequence is common across all 
tandem UTR isoforms expressed in 3T3 cells. With no reason to 
suspect that regulatory elements more densely populate the re- 
gions of the UTRs shared between isoforms, the surprisingly small 
amount of stability and translational regulation imparted by distal 
isoforms can be extrapolated to the remainder of the 3' UTRs to 
imply that most regulation of message stability and translation 
in 3T3 cells occurs through features of other mRNA regions, i.e., 
either the 5' UTRs or ORFs. This conclusion is surprising when 
considering that 3' UTRs comprise nearly one-third of all mRNA 
sequence and, as outlined in our introduction, have features expected 
to provide a highly favorable location for post-transcriptional 
regulatory elements. 

Our quantitative modeling also indicated a secondary role 
for 3' UTRs in regulating mRNA stability in 3T3 cells, with ORF 
exon-j unction density acting as the primary predictive factor. 
Exon junction complexes (EJCs) are deposited on mRNAs during 
cotranscriptional splicing and serve to identify premature stop 
codons, triggering nonsense-mediated decay (Schoenberg and 
Maquat 2012), and can also function in mRNA localization (Ghosh 
et al. 2012). Perhaps the stabilizing effect of exon junctions is also 
mediated through EJCs or through other protein factors such as SR 
proteins that are deposited near exon junctions in the nucleus 
(Singh et al. 2012). This stabilization might also involve trans- 
lation, because the association is strongest when using exon- 
junction density in the ORF rather than over the entire mRNA. 
However, it is difficult to envision how this stabilizing effect could 
be retained after the pioneer round of translation, during which 
these factors would be displaced from the mRNA by the translating 
ribosome. 

Four models could explain the abundance and sequence 
conservation of alternative 3' UTR isoforms despite the apparently 
subtle differences in stability and translational regulation imparted 
by these isoforms in 3T3 cells. (1) Perhaps most of these regulatory 
differences are biologically important despite their small magni- 
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tudes. (2) Perhaps most of these differences are not Important, but 
a smcill subset (perhaps preferentially those with greatest magni- 
tudes) are biologically Important. In this scenario, emergence of 
a tandem 3' UTR Isoform would typically lead to little change In 
protein output and essentially no change in evolutionary fitness. 
Such a scenario would allow the widespread acquisition of many 
tandem isoforms per gene but would not explain their sequence 
conservation. (3) Perhaps the regulatory differences imparted by 
tandem UTR Isoforms in 3T3 fibroblasts are magnified in other cell 
types or conditions. We suspect that our results will generalize to 
most mammalian cells In culture and perhaps to cells of most adult 
tissues, for which robust transcriptional regulation might domi- 
nate, thereby providing less need for 3' UTR-mediated differences 
in post-transcrlptional gene regulation. For other cells, however, 
regulation occurs in the absence of robust transcription (e.g.. In the 
early embryo, In red blood cells, or under certain stress conditions) 
or far from transcription (e.g., at neuronal synapses); these cells 
might rely more heavily on differences In stability or translation of 
alternate 3' UTR isoforms. (4) Perhaps alternative 3' UTR isoforms 
primarily serve to regulate other aspects of mRNA function such as 
mRNA localization. Localized translation of mRNAs is Important 
In a wide variety of cell types, and the c/s-regulatory elements 
Implicated in mRNA localization have most often been found in 
3' UTRs (Andreassi and Rlcclo 2009). To distinguish between these 
models and to evaluate the spatial and temporal conservation of 
gene regulatory processes operating through alternative 3' UTR 
isoforms and their magnitudes in other contexts, global quanti- 
tative approaches like that described here could be performed in 
other cell types and conditions, perhaps augmented by fraction- 
ation approaches to detect differences in subcellular localization 
(Wang et al. 2012). 

By globally comparing stability and translation of 3' UTR 
isoforms, we were able to estimate the relative contributions of 
these two post-transcrlptional regulatory modes and simulta- 
neously quantify the amount of repression or derepression asso- 
ciated with many conserved ds-regulatory elements. Compared to 
other global approaches, ours more conveniently and potentially 
more accurately measures both the absolute and relative amount 
of regulation associated with each potential element. The conve- 
nience and accuracy came from our ability to simultaneously eval- 
uate all types of regulatory elements, without requiring knockdown, 
knockout, or overexpression of the trans factors and the associated 
indirect effects that can result. An advantage of the alternative 
approach involving trara-factor perturbation Is that factors can 
more confidently be assigned to their corresponding regulatory 
elements, whereas our approach has the advantage of identifying 
and evaluating the relative importance of candidate elements for 
trans factors that have yet to be characterized. Both our approach 
and trflns-factor perturbation provide associations that speak to the 
function of the elements, whereas complementary biochemical 
approaches, such as those involving crossllnklng, provide impor- 
tant information regarding factor occupancy. 

Our de novo searches revealed multiple motifs significantly 
associated with mRNA destablUzatlon and/or translational re- 
pression. The accuracy of our results was Illustrated by our ability 
to Identify expected ds-regulatory motifs, including target sites for 
Pumilio cmd the two most highly expressed miRNAs in mouse fi- 
broblasts. We also discovered several novel motifs associated with 
stabilization of distal 3' UTR isoforms. Competitive binding to 
otherwise destabilizing cis motifs, such as mlRNA sites (Kedde et al. 
2007) and AU-rich elements (Barreau et al. 2005), has been reported. 
However, little is known about cis motifs that act independently 



to stabilize mRNAs — an exception being motifs that mediate cy- 
toplasmic polyadenylation, which occurs during oocyte matura- 
tion and early embryonic development and In some other settings 
(Villalba et al. 201 1). Although further experiments are required to 
validate our candidate mRNA stabilizing elements, their presence 
could help counteract the effects of known destabilizing elements 
and thereby explain the small average difference in stability of 
tandem 3' UTR isoforms. 

Methods 
Cell culture 

Mouse NIH 3T3 cells were plated at a density of 1 to 2 x 10'' cells 
per 15-cm plate and allowed to grow for —48 h before being used 
for downstream experiments. These conditions allowed a sub- 
stantial number of cells to be harvested from each plate while 
preventing large-scale contact inhibition. 3P-seq experiments were 
performed as previously published (Jan et al. 2011) using 60 txg of 
total 3T3 RNA and 16 cycles of PGR amplification. 

Annotation of murine polyadenylation sites 

Mouse 3T3 3P-seq data were used to annotate cleavage and poly- 
adenylation sites as previously described in Jan et al. (2011), with 
the following modifications: (1) Because of the larger 3' UTR size 
and lower gene density in mammals, RNA-seq coverage was not 
required to connect potential poly(A) sites to annotated gene re- 
gions; (2) tags within a ±20-nt window were consolidated into 
a single poly(A)-site annotation; (3) to define an alternative iso- 
form, tags supporting the poly(A) site were required to constitute 
2:10% of all the tags with 3'-terminal adenosines that mapped to 
the genie region; (4) in order to ensure that non-genome-matchlng 
A's were not the result of sequencing errors, each cluster was re- 
quired to include (1) at least one tag with greater than or equal 
to four 3 '-terminal A's, (2) two unique tags with non-genome- 
matching 3'-terminal A's (NGAs), and (3) a minimum percentage 
(10%) of the total NGA tags mapping to the genie region, or three 
unique NGA tags, at least one of which contained multiple NGAs. 
Python scripts used to annotate poly(A) sites from 3P-seq data are 
available online at http://bitbucket.org/nspies/buildpolyadusters. 

Stability measurements and 2P-seq 

Actinomycin D was added to plates at a final concentration of 10 
H,g/mL, and cells were harvested by trypsinization and TRIzol ex- 
traction. Approximately 15 M-g of total RNA for each sample were 
prepared for sequencing using 2P-seq. Briefly, poly(A)-H RNA was 
selected using oligo dT25 beads (Dynal) and eluted directly into 25 
(jlI of RNase Tl sequence buffer. RNase digestion was performed 
for 20 min at 22°C using 0.5 U RNase Tl (biochemistry grade; 
Ambion) and 1 jj.1 glycoblue (15 mg/mL). The reaction was quenched 
and precipitated using the Tl Preclpltatlon/lnactivatlon Buffer. 
Partially digested RNA was reverse transcribed (20-|xl reaction, 
30 min at 48°C, with 250 nM 2P-RT primer, 200 units Superscript 
III Reverse Transcriptase; Invitrogen), then base hydrolyzed. Sin- 
gle-stranded cDNA was purified on a denaturing polyacrylamide 
gel, then circularized (20-n,l reaction, 2 h at 60°C, with 100 units 
CircLigase II; EpiBio). CircLigase was heat-inactivated for 10 min at 
80°C. Circularized cDNA was amplified by high-fidelity PGR for 
12-15 cycles using barcoded PGR primers, then gel-purified and 
sequenced using a custom primer (2P-seq-PEl.l). A detailed, step- 
by-step protocol is included in the Supplemental Material. 

Half-life estimation was performed by fitting a nonlinear 
exponential curve to the library-size-normalized read counts for 
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tactD = 0,0,1,2,4,8 (two biological replicates at t = 0) using SciPy 
(http://www.scipy.org/). Significance of the differences in half- 
lives between isoforms was assessed using a t-test comparing values 
and variances of the exponential parameters; we found this stan- 
dard approach closely matched the bootstrap P-values and was far 
more computationally efficient. 

Translation measurements 

Cells were incubated in the presence of 10 (xg/mL cycloheximide 
for 5-10 min, and all subsequent solutions included cyclohexi- 
mide at this concentration. Cells were harvested by trjrpsinization, 
then quickly washed and lysed using cold solutions. Lysate was 
spun on a 10-mL, 10%-50% sucrose gradient for 2 h, then sepa- 
rated using a gradient master (Biocomp). Fractions were manually 
defined according to the A280 peaks indicative of ribosomal 
subunits or mono- and polyribosomes. Approximately 200 ng 
Saccharomyces castellii total RNA was added to each fraction im- 
mediately following fractionation, and then RNA was prepared 
by GuHCl extraction. Approximately 5-20 (Ag/sample was then 
prepared for sequencing using the 2P-seq protocol and 12-16 PCR 
cycles. 

Stability models 

Multiple linear regression was performed using the core R pack- 
ages, and nonlinear regression was performed using Multivariate 
Adaptive Regression Splines (MARS), using the earth package for 
R (Milborrow 2011). Both dependent and independent variables 
were log-transformed prior to regression analysis. miRNA targeting 
was quantitatively predicted for each mRNA using TargetScan 
context scores (Crimson et al. 2007). Because context scores are 
negative, and a lower negative number indicates greater predicted 
repression, the inverse of the context score was used to reflect the 
predicted repressive effects of miRNA targeting. For miRNAs with 
multiple expressed family members, the most highly expressed 
member was chosen to calculate supplemental pairing. AU-rich 
elements were scored by simple GC content as well as using 
AREScore (Spasic et al. 2012). The pumilio binding motif was 
defined as UGUAnAUA (Hafner et al. 2010). 

Motif analysis 

To reduce the search space for this analysis, enabling the statistical 
analysis of 7- and 8-nt motifs, we ranked all k-mers by their excess 
conservation in 3' UTRs. For a given k-mer at a given position in 
the genome, conservation was defined as the total branch-length 
across all species containing the exact motif in the full-genome 
MULTIZ 30-way vertebrate alignment (Blanchette et al. 2004; 
Kheradpour et al. 2007). Evolutionary conservation above back- 
ground was calculated by a hypergeometric test of the conserved 
and nonconserved occurrences of a given motif, relative to a co- 
hort of shuffled motifs, keeping the number of CpG dinucleotides 
constant. The top 901 (0.5%) most highly conserved motifs were 
used as candidates for subsequent analysis. 

Stability- and translation-associated motifs were identified 
by comparing the half -lives or translation rates of tandem 3' UTRs 
containing a candidate motif to control tandem 3' UTRs. Non- 
motif -containing tandem UTRs were sampled with replacement to 
closely match the 3' UTR length distribution of motif-containing 
UTRs. A Mann-Whitney [/-test was used to assess the significance 
of the difference in delta-stability (or delta-translation) between 
the motif-containing and control tandem UTRs. To increase the 
number of tandem UTRs, all adjacent pairs of tandem UTRs were 
used from genes containing more than 2 UTR isoforms (that is. 



individual comparisons of isoforms 1 and 2, 2 and 3, 3 and 4, etc., 
were performed) after applpng a tag threshold of 10 reads in each 
0-h (actinomycin D control) sample for each isoform to ensure 
reliable measurements. 

The relative repression of a motif was calculated as the 
amount of repression for motif-containing tandem UTRs com- 
pared to the repression in control tandem UTRs: 

Relative repression 

^ ^ Dist expression^„ti£-contaming / Prox expressionnjouj.^.„„taining 
Dist expressionjontroi/P''0''^ expression(.„nfjoi 

For example, control distal isoforms are 94% as stable as 
control proximal isoforms, compeired to let-7-containing distal 
isoforms, which are 64% as stable as the paired proximal isoforms, 
yielding a relative repression of 1 - (64%/94%) = 32%. The effects 
of miRNA targeting were evaluated by using either the experi- 
mentally defined motifs (see Tables 1, 2) or the full 8-nt seed- 
matched sites, which are generally most responsive to the miRNA 
(Crimson et al. 2007; Baek et al. 2008). 

Gene ontology analysis 

A custom CO analysis was performed using annotations from the 
Cene Ontology Consortium (Ashburner et al. 2000). P-values were 
calculated using a hypergeometric test. 

Data access 

Raw 2P-seq and 3P-seq sequencing data have been submitted to 
the NCBl Gene Expression Omnibus (GEO; http://www.ncbi. 
nlh.nih.gov/geo/) under accession number GSE44698. Murine 
poly(A) site annotations and isoform-level stability and translation 
data are available in Supplemental Table S4. 
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